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RESOLUTION ANALYSIS OF IMAGING WITH £i OPTIMIZATION 

LILIANA BORCEA AND ILKER KOCYIGIT * 


Abstract. We study array imaging of a sparse scene of point-like sources or scatterers in a homogeneous medium. For 
source imaging the sensors in the array are receivers that collect measurements of the wave field. For imaging scatterers the 
array probes the medium with waves and records the echoes. In either case the image formation is stated as a sparsity promoting 
optimization problem, and the goal of the paper is to quantify the resolution. We consider both narrow-band and broad¬ 
band imaging, and a geometric setup with a small array. We take first the case of the unknowns lying on the imaging grid, 
and derive resolution limits that depend on the sparsity of the scene. Then we consider the general case with the unknowns 
at arbitrary locations. The analysis is based on estimates of the cumulative mutual coherence and a related concept, which 
we call interaction coefficient. It complements recent results in compressed sensing by deriving deterministic resolution limits 
that account for worse case scenarios in terms of locations of the unknowns in the imaging region, and also by interpreting the 
results in some cases where uniqueness of the solution does not hold. We demonstrate the theoretical predictions with numerical 
simulations. 
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1. Introduction. Array imaging is an inverse problem for the wave equation, where the goal is to 
determine remote sources or scatterers from measurements of the wave field at a collection of nearby sensors, 
called the array. The problem has applications in medical imaging, nondestructive evaluation of materials, 
oil prospecting, seismic imaging, radar imaging, ocean acoustics and so on. There is extensive literature on 
various imaging approaches such as reverse time migration and its high frequency version called Kirchhoff 
migration liiiiiii], nicitchcd fidci py ^ Alultiplc Si^iicil Clcissificcition (AIUSIC) [H21 [2^ ; the linccir 

sampling method [9], and the factorization method |25j . In this paper we consider array imaging using 
optimization, which is appropriate for sparse scenes of unknown sources or scatterers that have small support 
in the imaging region. 

Imaging with sparsity promoting optimization has received much attention recently, specially in the 
context of compressed sensing mnnns], where a random set of sensors collect data from a sparse scene. 
Such studies use the restricted isometry property of the sensing matrix m or its mutual coherence [S] to 
derive probability bounds on the event that the imaging scene is recovered exactly for noiseless data, or 
with small error that scales linearly with the noise level. The array does not play an essential role in these 
studies, aside from its aperture bounding the random sample of locations of the sensors, and for justifying 
the scaling that leads to models of wave propagation like the paraxial one m- 

A different approach proposed in mm images a sparse scattering scene using illuminations derived 
from the singular value decomposition (SVD) of the response matrix measured by probing sequentially the 
medium with pulses emitted by one sensor at a time and recording the echoes. Ruminations derived from 
the SVD are known to be useful in imaging [13 H 1 0 [23] and they may mitigate noise. The setup in 
m, which is typical in array imaging, lets the sensors be closely spaced so that sums over them can be 
approximated by integrals over the array aperture. We consider the same continuous aperture setup here and 
study the resolution of the images produced by ii optimization, also known as basis pursuit and penalty. 
We address two questions: (1) How should we chose the discretization of the imaging region so that we can 
guarantee unique recovery of the sparse scene, at least when the unknowns lie on the grid? (2) If the imaging 
region is discretized on a finer grid, for which uniqueness does not hold, are there cases where the solution 
of the £i optimization is still useful? 

By studying question (1) we complement the existing results with deterministic resolution limits that 
account for worse case scenarios, and guarantee unique recovery of the scene for a given sparsity s. This 
is defined as the number of non-zero entries of the vector of unknowns or, equivalently, the number of grid 
points in the support of the sources/scatterers that we image. We consider a geometric setup with a small 
array, where wave propagation can be modeled by the paraxial approximation. We have a more general 
paraxial model than in |21] . which takes into consideration sources/scatterers at different ranges from the 
array. This turns out to be important in narrow-band regimes. We also consider broad-band regimes and 
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show that the additional multi-frequency data improves the resolution. 

It is typical in imaging with sparsity promoting optimization to assume that the unknown sources or 
scatterers lie on the discretization grid, meaning that they can be modeled by a sparse complex vector 
p G , where N is the number of grid points. If the unknowns lie off-grid the results deteriorate. We refer 
to P31[TS] for a perturbation analysis of compressed sensing with small off-grid displacements. General tight 
error bounds can be found in m- They may be quite large and increase with N. Thus, there is a trade-off 
in imaging with £i optimization: on one hand we need a coarse enough discretization of the imaging region 
to ensure unique recovery of the solution, and on the other hand finer discretization to minimize modeling 
errors due to off-grid placement of the unknowns. This trade-off is particularly relevant in the narrow-band 
paraxial regime, where the resolution limits may grow significantly with the sparsity s of the scene. 

At question (2) we consider hne discretizations of the imaging region, to mitigate the modeling error. 
The problem is then how to interpret the result p* of the £i minimization, which is no longer guaranteed 
to be unique. We show that there are cases where the minimization may be useful. Specifically, we prove 
that when the unknown sources/scatterers are located at points or clusters of points that are sufficiently 
well separated, an £i minimizer p* is supported in the vicinity of these points. While the entries of p* may 
not be close in the point-wise sense to those of p, their average over such vicinities are close to the true 
values in p in the case of well separated points, or the averages of the true values in the case of clusters 
of points. That is to say, £i optimization gives an effective vector of source/scatterer amplitudes averaged 
locally around the points in its support. Note that question (2) was also investigated in [20], where novel 
algorithms for imaging well separated sources have been introduced and analyzed. Our study complements 
the results in [20] by analyzing directly the performance of the £i minimization and £i-penalty, and also 
considering clusters of sources/scatterers. 

The paper is organized as follows. In section we formulate the problem, introduce notation, and 
describe the relation between imaging sources vs. scatterers. Question (1) is studied in section We 
describe the paraxial scaling regime and derive resolution bounds that depend on the sparsity s of the 
imaging scene. In section]^ we study question (2). In both sections we begin with the statement of results 
and numerical illustrations, and end with the proofs. A summary is in section]^ 

2. Formulation of the imaging problem. We formulate first the basic problem of imaging s point¬ 
like sources with a remote array of sensors that record the incoming sound waves. The generalization to the 
inverse scattering problem is described in section [2^ 

Suppose that there are s unknown sources located at points in the imaging region W C emiting 
signals fj{uj) at frequency oj, for j = 1,..., s. The hat stands for Fourier transform with respect to time, and 
reminds us that we work in the frequency domain. The receivers are at locations S A, for r = 1,..., Mr, 
where A is a set on the measuring surface, called the array aperture. The sound pressure wave measured at 
Xr and frequency w is modeled by 


p{w,Xr) = ^fj{u})G{w,Xr,yj), 


( 2 . 1 ) 


where G is the outgoing Green’s function of Helmholtz’s equation. The propagation is through a homogeneous 
medium with sound speed c, and the Green’s function is 


G{u}, Xr,yj) 


e*fe|xr—yj| 
4Tr\xr-yj\' 


( 2 . 2 ) 


where k = ujjc is the wavenumber. The inverse source problem is to determine {/j}j=i,...,s from the 
measurements (2.11 at one or more frequencies uj. 


2.1. Imaging sources with ii optimization. To state the inverse problem as an £i optimization we 
discretize W with a regular grid of rectangular prisms, and let Wm C be the set of N grid points denoted 
by ij. The lengths of the edges of the rectangular prisms are the components of the vector h S called 
the mesh size. The sources may be on or off the grid. If they are on the grid, as assumed in section we 
denote by S the set of indexes of the grid points that support them. Explicitly, we define the bijective map 


2 




J : {1,..., s} —>■ 5 C {1..., N}, such that = y^, for j = 1,..., s, and S — {J(l), ■ ■ ■, J{s)}- When the 
sources are not on the grid, we let S index the nearest points Zj to each source, as explained in more detail 
in section]^ We assume henceforth that N ^ s, meaning that the imaging scene is sparse. 

Let d G be the data vector, with components p(uJi,Xr), for 1 = 1,..., and r = 1,..., Mr- The 
number of measurements is M = M^Mr. Let also Q G be the sensing matrix, with entries defined 

by G{uJi,Xr,Zj)/aj, where aj normalizes the columns of Q, denoted by gj. Absorbing the normalization 
constants in the vector p of unknowns, we obtain the linear system 


gp = d. (2.3) 

For single frequency measurements at a; = wi the normalization constant is 

so that when the sources lie on the grid there are s non-zero entries in p, equal to 

Pj = j ^ (2-4) 

Here J~^ : S —t {1,..., s} is the inverse of the mapping J. For multiple frequency measurements we simplify 
the problem by letting 

Vj = l,...,s, (2.5) 

so that all the sources emit the same known signal /(w) multiplied by an unknown complex amplitude Rj. 
This simplification is motivated by the inverse scattering problem described in section |2.2| It keeps the 
same number N of unknowns as in the single frequency case, although we have more measurements. The 
normalization constants are 

. Il/Ih = (gl/A)t) , 

and the non-zero entries of p equal 

Pj = ^J-^(j)Rj-^{j)^ j ^ (2-6) 

Note that we could have written the multiple frequency problem for unknowns, the Fourier coefficients 

fjiuJi) of the signals emitted by the sources. However, at each frequency these have the same spatial 
support, so another optimization approach, known as Multiple Measurement Vector (MMV) [TS] would be 
more appropriate. For the purpose of this paper it suffices to consider the simpler model ( |2.6[ ). 

The £i optimization (basis pursuit) formulation of the inverse source problem is 


mm 


IpIIi such that gp = d, 


(2.7) 


where ||p||i = \Pj\- ™ sectionis to determine bounds on the mesh size h so that (2.71 has 

a unique s sparse solution, equal to the true p defined in (2.4) and (2.6). The analysis is based on the next 
lemma, following from EH El HD- 


Lemma 2.1. Suppose that (2.3) has an s sparse solution p and that the cumulative mutual coherence 
p(g,s) of matrix g with columns gj of Euclidian length equal to one satisfies 


p{g, s) = max max V | (g,, g/) 

j = l,...,A S=s 


1 

^ 2 ’ 


( 2 . 8 ) 


where (•, •) denotes the usual inner product in C”, and S is a set of cardinality jS”!. 
sparse solution of (2.3) and the unique minimizer of (2.1). 


Then p is the unique s 


3 











To deal with noise and modeling (discretization) error, we also consider in 
problem [32] 


section]^ the £i—penalty 


min ^ip), 

pGC« 


^{p) = -\\gp-A\\i 


-l\\ph 


(2.9) 


with parameter 7 accounting for the trade-off between the approximation error and the sparsity of the 
unknown vector p. 


2.2. Imaging point scatterers. The problem of imaging a sparse scene of point scatterers at for 
j = 1 ,..., s, can be written as one of imaging s sources, as we now explain. 

Let the array probe the medium with a signal /(w) emitted from the sensor at Xe. Using the Foldy-Lax 
model im 1 ^ we write the scattered wave at in the form ( 2 . 1 ), with effective sources at {yj}j=i,...,s 
emitting signals 


/i(w) = f(uj)RjUj{uj). 


( 2 . 10 ) 


Here Rj is the reflectivity of the j—th scatterer, and uj is the wave that illuminates it. It is given by the 
sum of the incident wave G(w,Xe,yj) and the wave scattered at the other points 

S 

Uj{uj) = G'(a;,Xe,yj) +^(1 - 6ij)RiG{u},yi,yj)ui{uj), Vj = 1,... ,s, (2.11) 

1=1 


where Sij is the Kronecker delta. 


In the Born approximation we neglect the sum in (2.11), and simplify (2.10) as 


/iM = /(w)i?jG(w,Xe,yj). 


( 2 . 12 ) 


This can be written in the form (2.3) with entries of p like in (2.6), and slightly redefined matrix Q and 
normalization constant 


\\ ^ (47r)4|xr -yj|2|xe -yj|2' 


Multiple scattering effects can be included by solving the Foldy-Lax equations (2.11) or, equivalently, 
the linear system 


Qu = 


/G(a;,Xe,yi) 


VG(a;,Xe,y^); 


with u the vector with components Uj, and Q = the matrix with entries 


Qji = - (1 - ^ji)G(a;,yi,yj)i?/. 

Note that Q is a perturbation of the sxs identity matrix, and depending on the magnitude of the reflectivities 
and the distance between the scatterers, it is invertible. Again we can write the problem in the form (2.3) with 


entries of p like in ( 2 . 6 ), except that now Uj are more complicated and depend on the unknown reflectivity. 
An elegant solution of this nonlinear problem is in | 12 | . It amounts to solving a source imaging problem like 
(2.7), to determine the locations y^, for j = 1,..., s. Because there are multiple emitters the authors use an 
MMV approach. Then the reflectivities are estimated using the Foldy-Lax model. 

Given the relation between inverse scattering and source problems describe above, we focus attention 


henceforth on imaging sparse scenes of sources using (2.7) or (2.9). 
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Fig. 3.1. Schematic for the paraxial setup. 


3. Imaging with small arrays. The setup is illustrated in Figure 3.1 We consider a planar square 
array of aperture size a, and a coordinate system with origin at the center of the array and range axis 
orthogonal to it. The locations of the receivers are = (x^, 0), with x^ = {xi^r, 2 ^ 2 ,r) and 12 ^ 2 ,r| < n/2, 
for r = 1,... Mr- The imaging region IT is a rectangular prism with center on the range axis, at distance 
L from the array. It has a square side [—D/2,D/2] x [—D/2,D/2] in the cross-range plane, parallel to the 
array, and length in the range direction. The discretization of W has grid points Zj = (zj,Z 3 j)^ with 
cross-range vector zj = {zij, Z 2 ,j) and range z^j, an d the mesh size is h = {h, h, / 13 ). Our goal in this section 
is to estimate h so that the £i optimization problem (2.71 determines exactly the unknown sources supported 
at Zj for j G 5, a set of cardinality s. 

We begin in section |3.1| with the scaling regime and the paraxial model of wave propagation. The 
resolution limits are stated and illustrated with numerical simulations in sections |3.2| and |3.3| The setup of 
the simulations is described in appendix]^ The proofs are in section [3.4| 


3.1. Scaling regime and the paraxial model. The scaling regime is defined by the relation between 
the important length scales: the wavelength A, the range scale L, the aperture a, and the size D and of 
the imaging region. We assume for now a single frequency w, and refer to section [373| for the multi frequency 
case where another important scale arises, the bandwidth B. 

The scales are ordered as 


X<t: D <^a<^ L, L, 


(3.1) 


and satisfy the following assumptions 


^>1 

AL ^ ’ XL L ^ ' 


XL 


< 1 , 


or aD 

xlI^ 


< 1 , 


a 

XL 


D 3 

L 


< 1 , 


a 

XL 


id 


2 D. 


< 1 . 


(3.2) 

(3.3) 


Roughly, conditions (3.3) say that the imaging region is small enough and far enough from the array, so that 


we can linearize phases of the Green’s functions in Zj, for all j = 1,..., N. Physically, this means that when 
viewed from the imaging region, the wave fronts appear planar. The array aperture a is small with respect 


to the distance L of propagation of the waves, but equations (3.2) say that the Fresnel number is large, so 
we have diffraction effects. 

We show in appendix |B| that 


Mr- 

E 

r—l 


G{w,Xr,zdG{uj,Xr,Zq) 


^ik{z3j-Z3^g) 

(47rL)2 


Mr 


2L2 


(3.4) 


r—l 


where the bar denotes complex conjugate. 


Remark 3.1. The terms in {3.4) are highly oscillatory when k is large, but can be absorbed in 


the vector of unknowns. This is convenient because it implies that when two points Zj and Zq are close to 
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each other, the inner product of the j—th and q—th columns of the scaled sensing matrix is close to o ne. 
Let iF be the diagonal matrix with entry , for j = 1,... ,N, and rewrite the linear system (2.3) as 

Qp) = d. To simplify the presentation we denote henceforth by Q the n ew s ensing matrix and 
by p the new unknown vector 3f~^p, so that the scaled system looks the same as (2.3). The Green’s function 
with the large phase removed is still denoted by G , and satisfies 


J2G{uJ,Xr,Zj)G{u},5Cr,Zq) « 

-r-— 1 ' ' r — 1 


l^rl (23,i-®3,g) , 

- ^ -^- 1 - 


r—1 ^ ' r—1 

Assuming that the receivers are on a square grid of spacing h ^, satisfying the scaling relations 

XL 






(3.5) 


(3.6) 


we see that the exponential in (3.5) is approximately constant in each grid cell in A. This allows us to use 
the continuous aperture approximation in the analysis, where the sum over r can be replaced by the integral 
over A = [—a/2, a/2] x [—a/2, a/2]. 


_ I 

'^G{uJ,3Cr,Zj)G{uj,Xr,Zg) « 


dx 


L 


(^3,j -^ 3 , 9 ) I 

—- 1 - r, - 


(3.7) 


With our discretization the number of unknowns is iV = [D/h)^ D^/h^ and the problem is underdetermined 
when the number of measurements M = Mr = {a/h^) satisfies M < N or, equivalently, 




(3.8) 


which is consistent with (3.6) for /i <C I? and hs D 3 . 


3.2. Single frequency resolution limits. Using the paraxial model for our sensing matrix Q, we 

(3.9) 


obtain from (3.7) and the definition of the cumulative coherence p{Q,s) that 

,( Z2,3 - Z2,q Z3,j - 2:3,9 \ 

V H ’ m J’ 


(n \ 7/7 ^^3 Z3j Z3 q 

Ai(b, s) =. max^ max > U\—- 


i=l....,Ar|S|=i 


96 S,q^j 


H 




U\ 


where 


ka 27ra ’ 


H. = 


2 L^ 

ka^ 


XL^ 


na^ 


and lL{/3,r]) is the absolute value of the Fresnel integral 

rl/2 


U{l3,r]) = 


d^^-i 0 t-^vt^ 


1 - 1/2 


(3.10) 


(3.11) 


The search set of cardinality s is denoted by S, to distinguish it from the set of indexes of the true support 
points of p, called calygraphic S. 

As stated in Lemma 2.1 unique recovery of a sparse p with the minimization (2.7) is guaranteed when 
p{G, s) < 1/2. This criterion allows us to estimate the resolution limit stated in the next two theorems. 

Theorem 3.2. If the mesh size satisfies 


U U* 2 AL 

h> h* = -, 

TT a 


Lr ^ Lr — 


16 AL2 


(3.12) 


£i optimization recovers exactly two sources located at any distinct grid points. 
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Fig. 3.2. Illustration of recovery of two imaging scenes discretized at base resolution The results on the top 

line are for 3 sources and on the bottom line for 20 sources. The distribution of the sources is displayed in the right column. 
The left column shows a cross-section of the images. The range axis is in units of h^ and the cross-range axis in units of h*. 
The exact location of the sources is superposed on the images. They all have strength pj = 1, for j £ S, and the magnitude of 
the reconstruction is shown with the color bar. 


We call the estimates h* and /13 the “base resolution”. They are the same, up to order one constants, as the 
well known resolution limits in array imaging [7]. We verified numerically the estimates ( |3.12[ ) as follows. 
To check the value of h*, we solved the optimization problem (2.7), as explained in appendix]^ for data 
corresponding to a vector p supported at any two grid points offset in cross-range. We determined the 
smallest h so that the relative error between p and its numerical reconstruction was less than 1%. A similar 
estimation was done for hg, with p supported at points offset in range. The results were close to those in 


Theorem 3.2 0.46AL/a for h* and 3AL^/a^ for hg. 

The next result states that when there are more sources to estimate, the resolution limits deteriorate. 
This is also illustrated in Figure |3.2[ where we display images discretized at base resolution. The recon¬ 
struction is perfect for 3 sources (top line), but not for 20 sources (bottom line). How the resolution limits 
deteriorate with s depends on the distribution of the sources in the imaging region. Our estimate in the next 
theorem accounts for worse case scenarios. 


Theorem 3.3. There exists a constant C of order one such that if the mesh size satisfies the conditions 

h/h* 


= 0 ( 1 ) and 


Y ^ / 

2/lg 

\h*j 

h*3. 


1 1/3 


> 




hs/hs 

£i optimization recovers exactly s sources located at any distinct grid points. 


(3.13) 


The isotropic dilation of the mesh in (3.13) is for convenience, but the result generalizes to anisotropic 


dilations, where the mesh is stretched much more in one direction than the others. We refer to the proof in 
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section 3.4.1 for details on the generalization. The resolution decrease with s predicted by Theorem |3.3| may 
be traced to the slow decay with range offsets of the terms | {gj,gk) \ summed in ^{Q, s). This is also why 

hl/h* = 8L/a > 1. 


Sources at different ranges may have strong interaction, hence they must be further apart in order to get 
s) < 1/2. In the next section we show that if the base range resolution improves, as it does with broad 
band data, then there is almost no resolution loss with the sparsity s. 

3.3. Broad band resolution limits. When we have frequency measurements, the vector of un¬ 
knowns is defined as in ( |2.6[ ), and the rows of the sensing matrix Q are indexed by the receiver-frequency 
pair (r, j) S {1,..., Mr} x {1,..., M^^}. Let ujo be the central frequency, so that 


I ^o\ ^ ^ ^ j — fl--') ; 

where B is the bandwidth assumed to satisfy the scaling relation 

- D Ds 


1 5/ max ■ 


XoLja XoL’^ja^ 


1 Wo 

(L\ 

^ < 

- 

J B 

\aJ 


(3.14) 


(3.15) 


The lower bound says that uJo ^ B, so that uJo is the scale of all the measured frequencies ojj, for j = 1,... M^. 
The upper bound implies cjB Xgl? jc?, where we recall from the previous section that \al?‘ jc? is, up 
to a factor of order one, the base range resolution for single frequency measurements at w = Wo- The next 
theorem states that the base range resolution for multi frequency measurements is of order c/S, so (3.15) 
implies a gain in range resolution. 

Let us assume, for simplicity of the calculations, a Gaussian signal 


/(wj) = e 


(3.16) 


The results should extend to any signal with bandwidth B, with modifications of the constants in the bounds. 
We show in appendix [C] that 


Mr 


EE I /(Wj ) I^G(Wj , , Zq)(/(w^-, Xr, Z;) 

1 = 1 r=l 


(AttL)'- 


E< 

Mr. 

E 


- -- {Z3,q-Z3,t) . 


— ikn 


rr (23,g-*3,z) 


(3.17) 


where kg = oJo/c is the central wavenumber, and we proceeded as in Remark 3.1 to absorb the large phases 
g*feoZ3,g vector of unknowns. Assuming that the array is discretized on a mesh with spacing 


satisfying ( |3.6| ), we approximate the sum over r by an integral over the aperture A, as in the previous 
section. We also suppose that the frequencies uij are spaced at intervals satisfying <C cjD^, so that 


we can write the sum over the frequencies as an integral over the bandwidth. Equation (3.17) becomes 

M„ Mr 




V^B 


1=1 r=l 




-e 2^2 X 


f -iko 

/ dxe 


1Z2 1“ L 

Ia 



(3.18) 

(3.19) 


and we can simplify it further by neglecting the quadratic phase in the integral over the aperture. This is 
because 



























for range offsets in the support of order cjB of the Gaussian factor in (3.19). Integrating over the aperture 
and normalizing, we arrive at the following model of the products of the columns of the sensing matrix Q, 


I (g</,gi) I 


= g 2(c/B)^ 


/ Zl^q - Zl^l 

V 2L /{kott 


smc 


Z2,q - Z2,l \ 
2L/{koa) ) 


(3.20) 


These are the terms in the cumulative coherence fi{Q, s), and the resolution limits are as stated next. 

Theorem 3.4. Assume a sensing matrix Q with inner products of the columns defined by { '3.2(Jj) . If the 
mesh size h = (/i, h, hfi) satisfies 


h > h* = - hs > hi = V2 ln 2 —, 

TT a B 


(3.21) 


optimization recovers exactly any two sources on the grid. Moreover, there exists an order one constant 
C > 1, independent of s, such that if 


A A 


> Clns, 


(3.22) 


£i optimization recovers exactly any s sources on the grid. 

We call h* and hg the base resolution, as in the previous section. While the cross-range resolution h* is 
the same as in the single frequency case, the base range resolution hg is significantly better. Moreover, there 
is little loss of resolution at large s. The mesh size grows at most logarithmically with s, as opposed to 
in the single frequency case. This agrees with the known fact in array imaging that bandwidth improves the 
quality of images. 


3.4. Proofs. The proofs of Theorems 3.2 and 3.3 which estimate the resolution in the single frequency 
case are in section 3.4.1 The proof of the broad band result in Theorem 3^ is in section |3.4.1[ 

3.4.1. Single frequency. We begin with some basic bounds on the Fresnel integral (3.12 1 . The simplest 
estimate is for rj = 0, in which case 


= sinc(/3/2) < min{l, 2//3}. 

For r] 0 we can change variables and rewrite (|3.12[) in the form 


(3.23) 


= -p 

A? 
2^/2 
7^’ 


, 0 + V 


dti 


2 ^ 


< 




, /3 + r? 
2 v^ 


'( 


dt{ cos t^ — i sint^ 


2 ^A? 


)\r< 


dt ( cos t^ — i sin t‘ 


< 


(3.24) 


for any f} Q and /3 > 0, where we used that 


dt cosB 


< 1 , 


dt sint^ 


< 1 , 


Vo G M. 


The final estimate 


U{/3,i]) = 


Vv 


L 




dti 


0 -V 


< 


TT -f 1 


a 


for /3 > a + rj, V a, 77 > 0, 


(3.25) 


follows from contour integration, as shown in appendix [Pj 

Proof of Theorem 3.2 We wish to estimate h and hg so that p.{G, 2) < 1/2. The cumulative coherence 
for s = 2 is the same as the mutual coherence [HI [31], and in our case it takes the simple form 


77(^1,2) = ^max 

\ JrL \ JrL / 




(3.26) 
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Fig. 3.3. Surface and top view display of the Fresnel integral U{13, rf) for |/3|, \r}\ < 60. In the right plot abscissa is g and 
the ordinate is /3. 

Here we used that the sources are on the grid and denote by C = (Cij C2) Cs) vectors with integer components. 
We display in Figure 3.3 the Fresnel integral U{/3,ri), an d not e that it is bounded above by 1, and larger 
than 1/2 for C near the origin. When C3 = 0 we get from (3.23) that 




/iC2 


< 


2H 

TT’ 


because at least one of (/i and C2 is not equal to zero. If Cs 0 we have by (|3.24|) 

U 




'hC 2 

h3C,3\ 


V H ’ 

H 3 ) ^ 

. H ’ 

H 3 ) 

~ ^3 


Thus, is guaranteed to be less than 1/2 if ft, > 4iJ = ft* and fta > I 6 H 3 = h\. This concludes the 

proof of Theorem |3.2[ □ 


Proof of Theorem|3.3[ Note from the expression (|3.9|) of the cumulative coherence that it is translation 

(3.27) 


invariant in K'^. Thus, we can fix the origin at one source location and rewrite (3.9) as 

s) = max > U 

|A|=s-l^ 


(Ki 


'K 2 

^3C3\ 

V H ’ 

H 3 ) ^ 

. H ’ 

H 3 ) 


CGA 


where A C \ {0} is a set of cardinality s — 1. The proof of the theorem follows from the bound on fj,{G, s) 
stated in the next lemma and the relation ft* = AH and ft^ = I 6 H 3 established above. The constant C in 
the lemma is the same as in (3.13). 


Lemma 3.5. There exists constants C, Ci and C 2 of order one such that 

<25/3c 

|A| = s-l 


CGA 




'K 2 

h3C3\ ^ ^5/3(j 

\ H 

H 3 ) 

H 

H 3 ) - [ 


1/3 


(ft/ift)2ft3/ift3j 
H 3 , „ /i7\2 

3 


Ci^lns + C2(^yin\ 

fts V ft / 


(3.28) 


The assumption in Theorem 3.3 that the mesh is dilated in an isotropic fashion, and the relations H = h* jA 


and H 3 = ftJ/lB established above imply 

2 1 1/3 


{h/H)^h3/H3 




s2/3 

In s 


^3 

H 3 


2/3 


= 0 


g2/3 

In s 
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and 


r s" 

1/3 

s2/3 h 

[(h/Hyhi/H3\ 


(f)'i»“» 


In^ s H 


^)/(l 


-I 1/3 


= o 


5^/3 h 
In^s H 


Thus, the first term in the right hand side of (3.28) dominates the others for large s and h> hi, = AH, and 


the result stated in Theorem 3.3 follows. For anisotropic dilations of the mesh the logari thmic terms in (3.28) 
may become important. For example, when h^/H^ > {h/H)'^s^ / \tPs, the last term in (3.28) dominates the 


bound, and we can get a unique solution for a modest mesh stretch in the cross-range direction h/H = O(lns) 
at the expense of a very large stretch in range h^/H^ = 0 (s^/ln^s). 


Proof of Lemma 3.5: Let A* be the set on which the maximum in (3.27) is achieved. It is difficult to 


determine A* explicitly, so we construct another set A^, which allows us to bound the cumulative coherence. 
We use the behavior of the Fresnel integral U, shown in Figure [01 to guide us in the constructi on. W e write 
At- as the union of two sets A+ and A°. The first set contains the left and right cones (in Figure 3.3 they are 


defined by the diagonal lines |/3| = |? 7 |), where U displays a slower decay, as well as a vicinity of the origin 


H 


— < C G S.t. 0 < ICal < , , rr I ICll; IC2I < yy 


hs/Hs’ 




IC3I 


The second set is for the points with Ca = 0, 


A°= CGZ 3 \{ 0 } s.t. |Ci|,|C2|< 


(3.29) 


(3.30) 


h/H } ■ 

The parameter r > 1 is used to control the volume of At- = A+U A°, so that it contains at least s grid points. 


T = mm • 


‘is / h 

Y\h) Hs 


1 1/3 



The proof consists of two steps. First we derive the bound 

E"(' 


C6A. 


( ^Cl,q 

^aCa,?' 


^3C3.9\ 

\ H 

’ Hs - 

r \ H ’ 

Hs ) 


< E ^( 0 ’ 

C6At 


(3.31) 


(3.32) 


where 


^(C) = 


f 2V2(7r+l)H3 


1 

A 3 IC 3 I ’ 



Xi.O + 7,101 J 

x 1 ^-H(1-‘'C2,o) 

X 2 .O+ 7,101 


if C e A+, 
if Ce AO. 


(3.33) 


Then we estimate the sum in the right hand side of (3.32). To prove (3.32) we show: 

(i) A+ contains at least s grid points. 

(ii) For any C G A,-, we have 


U 


hCi (h(^2 


H ’ Hp 


X 


(iii) For any C i K, 


U 




'hC 2 

CO 

V H ’ 

-ffa y ^ 

. H '■ 

H 3 ) 


H m 


< 


<z{C)- 


= min Z{(^). 
CGA+ 
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By showing (i), we establish that there are at least as many terms to sum on the right hand side of (3.32) 
as on the left, and we can define a one to one map ^ : A* —>■ A,-, such that ^(C) = C if C G and 

^{C) G A+ if ^ e A* \ ^A* n At^ . Points {ii) and {iii) ensure that 




VC G A*, 


from which (3.32) follows. 


Proof of (i): This statement is trivial when r = sh/H or t = sh^/Hs, because A+ has cardinality larger 

2 . -11/3 


than s, by definition. Thus, let r = 


Ai 

H3 


E 

CeAt C36Z.0<|C3l<'rff3/^3 

'Hh3\^ 


and calculate the number Sr of grid points in as 

2 

ICal 


H 






> 4 


KhHs) 

>4^) 

- \hmJ 


j: i<3t 


C3eZ,0<|C3|^'J'-f^3/^3 


fHhs\^2 /rH3\^ 


3 V /i3 


8 

3 VX 


^3.^3 


T-^ > S. 


The first inequality is by definition of A+. The factor 2 is due to the absolute values and [-J denotes the 
integer part. The second inequality is because we omit one positive term in the sum. The third inequality 
is by direct summation 

.2 n{n + l)(2n + 1 ) ^ 


1=1 


6 


Again the factor 2 is due to the absolute values. The last inequality is by definition of r. This concludes the 
proof of (/). 


Proof of (ii): This follows immediately from bounds (3.23) and (3.24). 


Proof of (Hi): We note from definition (3.33) of 2^(C) and (3.29) that 

. 272(71+1) 

mm Z(Q) = . 

CeA+ T 


Now consider an arbitrary C ^ A^ . Recalling definitions (3.29)-(3.30), we distinguish three cases: 
1 - If ICsl ^ we have by (3.24) that 


U 


(Ki 

> 

CO 

'K 2 

Il3C3\ 

V H ■ 

Hs > ' 

. H ’ 

Hs ) 


< 


8 8 272(71 + 1 ) 

^3|C3|/-ff3 ~ T ^ T 


2 . If Cs = 0 we can assume without loss of generality that |Ci| > tT/? since at least one of Ci and C2 must 
satisfy this condition. We obtain from (3.23) that 

272(71 + 1 ) 


«(f,o)«(f,o) 


< 


2 ^ 2 ^ 


3. If 0 < IC 3 I < we can assume without loss of generality that 




7^3|C3|/-ff3 ^3 


IC 3 I 
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since at least one of (i and C 2 niust satisfy this condition. Then 


"(If..|<>) s 


'3 , \ ^ tt + 1 I hs 

Tj- Ksh 
tl?, 


by estimate (3.25), and using (3.24) for the other Fresnel integral we get 


(Ki 


'hC2 

^3C3\ 

\ H '■ 

H 3 ) ^ 

. H ’ 

U 3 ) 


This concludes the proof of (3.32). 

It remains to estimate the right hand side in (3.32), 

t/A 2V2(7r + l)iJ3 1 ^ , 2i/(l-(5^ 0 ) 




. + IC3I . 

CeA+ cgAo 


hKi 


, , 2 i/(l-%,„)! 


For the first sum we have by the definition (3.29) of A+ that 

1 . H- 


H 3 ^ ^ 

hs ^ IC3I “ ^3 ^ ^ IC3I 

CeA+ C36 Z.o<IC3|<^ 


2H 

1 + ^ 


n 2 


^ E 

C36Z,0<|C3|<^ 


16 


(f)^ 


^ W^3|C3|/i^3 

2i?3 


:2 + #IC3| 




^3|C3 


Now using that 


i=i 


n{n + 1 ) 


n , 2 '*1 

E l TT ^^ 1 t / \ 


and substituting in the bound above, we get 


El 

hs 


C6A+ 

For the second sum in (3.34) we have 


T + 


f = l 




i=i 
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2 H 3 


1 + ln 


(f) 


E K-c 

CgA? 


2 g(l-%.o) 

/ilCil 


'^C 2,0 + 


2 g(l - %,o) 

h\C2\ 

2H 


~ E [%.o(l-%.o 


CeA? 


2 Fr 


'^C 2 .o(l - + (1 - %,o)(l - %,o) 


2H 2H 

WXWXl 


because we cannot have both Ci and (2 = 0. The first term is bounded 


as 


nTT ATT L' ^-^/ Aj .. 3 fJ 

£6A; ' ' <-i 

and similar for the second term. For the last term we have 


E “ %.o)(l - <5 c2,o) 

CGAO 


2H 2H j2HY A / , , „,^m 2 


(3.34) 
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Gathering the results we have for large s, and therefore large r, 




c. 


Cga^ 


In^r 


(3.35) 


with constant C close to 16-\/2(7r + 1), Ci close to 2 and C 2 close to 16. Here we used the expectation that 
h > h* = AH and > h\ = IQH. To finish the proof of the Lemma we obtain from definition (3.31) of t 
that 


Ih 

h:i 


In 


Inr < 


hs/Hs 


In s 

■ hs/Hs + ® 


and similarly for H/hlnr, where we used that \nx/x attains its maximum over the interval [l,oo) at 

X = e. Moreover, we note that the first term in (3.35) is negligible in comparison with the others unless 
, 2 , -| 1/3 

. Substituting this expression of r in the first term and using that s is large, we get 


3s 
8 

Lemma 


(a] hiL 

\h ) H 3 


3.5 


with constant C close to ( 3 / 2 )^/^( 7 r + 1), Ci close to Ci and C 2 close to € 2 - This concludes the 
proof of Theorem 3.3 □ 

3.4.2. Broad band. The proof of Theorem |3.4| is similar to that of Theorems m and |3.3[ with modi¬ 
fications that account for the faster decay with the range offset of the inner products (3.20) of the columns 
of the sensing matrix. 

Using the translation invariance of (3.20) and writing explicitly that the points are on the grid, we 
can write the cumulative coherence as 


/i( 0 , s) = max 1 2 ^ e ^ W 3 J smc(^^^j smc(^^^j 


(3.36) 


CgA 


Here A e \ {0} is a set of cardinality s — 1, as before, and we introduced the notation 




V2c 


n = 


2L 

knCL 


To derive the base resolution limits, let s = 2 in (3.36|) and observe that 

n 


V '”3 / smc y ^ j sinc^ ^ j < e '' '”3 / , for C 3 Oj 


n 

uniformly in Ci) C2 G and when ^3 = 0, 

)r , 1 - %.0l , 1 - %.0l 

Thus, ^(5,2) < 1/2 when T-L/h < 1/2 and < 1/2 or, equivalently, when 

h > 2% = h* and = hg. 


smc —r— 

■ (h\C2\\ 

smc —r— 

< 

\ n } 

\ n } 

. 


(3.37) 


(3.38) 


(3.39) 


as stated in equation (3.21) of Theorem 3.4 


(3.39) to write 


To prove the second statement of Theorem 3.4 for large s, we use assumption (3.22) and the relation 

(3.40) 


^ ^ fl 7^1 


for a constant C that is slightly larger than C. We also define Z : Z3 


Z(C) = e 


- .-rcl 


%.o + 


1 -^, 


Ci.o 
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(3.41) 























































and using (3.401 in (3.36) we get 


fi{Q, s) < ^ max 


|A|=s 


(3.42) 


CGA 


Now let A* denote the maximizing set in (3.42). We do not know it explicitly, but we can define a 
one-to-one mapping from A,,, to another set A,g which allows us to bound ^{Q,s). The construction of the 
set 


a^ = {cgz 3\{0} s.t. 2(0 >Y^} 


/JsJ’ 


(3.43) 


is motivated by the rapid decay in range of the terms in the s um in (3.36). Explicitly, we note that when 
C G A /3 we have Ca = 0 because if this were not true, definition (3.41) would give 

2(C) < = —. 

I3s 


Here we assumed /3 > 2, which is consistent with (3.40) for large s, and since /3 > ln/3, we also have 
2/3 > ln(/3) -I- Ins = ln(/3s). 

Let A^ be the intersection of A^ with the Cj axis, for j = 1, 2. Then, the cardinality |A^| of the set Ap 
satisfies 


|A/ 3 | > |A^| -I- |A^| — 4s, 


because by definition (3.43), Cj G A^ means that |((),| < s, for j = 1,2. Thus, there are at least 4s points 
in A^, and we can define a one to one mapping from the maximizing set A* to A^g. Moreover, since for any 
C ^ Ap we have Z{C) < l/(/?s) we conclude from (3.42) that 


M^,s)< E ^(0- 


To bound the right hand side in this equation, note from definitions (3.41) and (3.43) that Ap is contained 
in the punctured disk Vs of radius s. 


= {c G \ {0} s.t. C3 = 0, |ClUC 2 |<s}, 


and obtain 


z{c) = n E 

CeVs i=i|OI<s 

41ns 


l-<3, 


’ 0.0 


0.0 


41n^s 


/3|0I 

4 In s 


- 2 { 0 ) < 


1 + 1(1 


1 2 


P' 


In; 


- 1 


/32 


P 


4 


-(1-1- Ins) 


The proof of Theorem |3.4| is completed_with the o bservation that we can make the bound in this estimate 
less than 1/2 by choosing the constant C in (3.40) large enough, independent of s. □ 


4. Imaging on fine grids. The resolution estimates in Theorems 3.2||3.4| do not account for noise 
and modeling errors due to off-grid placement of the sources, which may be large for coarser discretizations 
required by the theorems. In this section we mitigate the modeling error by discretizing the imaging region 
on a fine mesh, and then study the results of the £i optimization. As the results in the previous section 
shows it is impossible to have a meaningful answer for arbitrary distributions of the sources. However, 
if they are located at points or clusters of points that are sufficiently far apart, the results are useful, as 
illustrated in Figure |4.1[ In the left image we display the result o f th e optimization for a discretizations of 
the imaging region at the base resolution h* defined in Theorem 3.2 If the sources were on the grid, the 
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Fig. 4.1. Effect of modeling error due to sources off-grid. From left to right: Image discretized at base resolution, at 1/2 
base resolution and 1/4 base resolution. 


reconstruction would have been perfect. Here the sources are off-grid, and the reconstruction is poor due 
to the modeling error. The result is better in the other two plots because the modeling error is reduced by 
taking a smaller mesh size. While in general a discretization at 1 /4 h* does not guarantee a good recovery, 
here the result is good because the sources are far apart, and thus have little interaction. The analysis in 
this section formalizes this observation. We point the reader to [5D] for a different study of similar ideas, and 
algorithms designed to take advantage of the weak interaction between the sources. Here we study directly 


the optimization problems (2.7) and (2.9) and consider in addition clusters of sources. 


We begin in sect ion|4.1| with the statement of results for well separated sources and then consider clusters 
of sources in section |4.2[ The proofs are in section |4.3| 


4.1. Statement of results for well separated sources. Let us modify the notation slightly, and 
call gy the normalized vector of Green’s functions taking us from the array to the point y in the imaging 
region. When y is a point Zj on the grid, then gy is the same as gj defined before, and we have the simpler 
notation gj = . We quantify the interaction between two sources located at y and y' in the imaging 


region W by the value of 


(gy,gy') . Explicitly, in terms of the semi-metric ^ : W x W —>■ [0,1], 


V y, y' G W, 


^(y,y') = l- (gy,gy') 

that defines the open ball 

Br{y) = {feR^ s.t. ^(y,y')<r}, 

we say that points y' outside Br{y) have a weaker interaction with y than the points in the ball. 


(gy! gy' 


< 1 — r. 


Vy' ^ Br{y). 


(4.1) 


(4.2) 


(4.3) 


In these definitions it does not matter if we have single or multiple frequency measurement^ In both cases 
we know from the previous section that (gy, gy/) is a function of y — y' which peaks at the origin, and is 
monotonically decreasing in its vicinity. This means that there exists a small enough f such that if r < r 
and y' G Br{y), y and y' are close in Euclidian distance. 

Suppose that we have s sources in the imaging region W, supported at points in the set y = {fj, j = 
1,... s}, and discretize W on a grid with N points Zj and mesh size h that is as small as needed to mitigate 
the modeling error. We define the interaction coefficient of the set y by 


liy) = max 

q=l,...,N 


(gyjigij) 


yjey\{.yV{Sff} 


(4.4) 


*The theory applies to both single frequency and multiple frequency measurements, but the numerical simulations are for 
a single frequency. 
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where cyV(zq) is the closest point to Zq in y, with respect to semi-metric Here it is possible to make 
T(y) independent of the mesh by replacing the maximum with supremum over the imaging window which 
is an open subset of Note that T(y) is similar to the cumulative mutual coherence ^{G, s), except that 
the set y is fixed and the points in it may not be on the grid. Note also that X{y) is well defined even when 
there are multiple points in y that are closest to Zq. In such cases we let ^{zq) be any one of these points 
without affecting the value of I{y)- 


The next two theorems describe the support of the £i minimizer. Theorem 4.1 and its corollary are for 


formulation (2.71 of the optimization problem, which assumes an exact model. Theorem 4.3 is for formulation 


(2.9) which accounts for noise and modeling error. 

Theorem 4.1. Suppose that the unknown sources are supported on the fine grid at points Zj enumerated 
by the set S of cardinality s, and are represented by the s-sparse vector p G , satisfying Gp = d. The 
sources are assumed sufficiently far apart so that for some r G (0,1) the balls Brizj) are disjoint. Let p* be 


the solution of the optimization problem (2.1) and decompose it as 


P* = P 


(i) 


(o) 
P* , 


(4.5) 


where supppi*^ C Br{zj), and pi°'^ is supported in the complement of this union. Then, 
jes 


(o) 

Ip* 


< 


2i{y) , 


ip*iii- 


(4.6) 


This theorem says that when the interaction coefficient T{y) is smaller than r/2, the support of the 
optimal p* is concentrated in the vicinity of the sources. The next corollary quantifies the error of the 
reconstruction. 

Corollary 4.2. Under the same assumptions as in Theorem \4.1\ the error of the reconstruction is 
quantified by 


1p-p*||i<^I1pI1i, 


(4.7) 


where p* is the effective source vector in with j—th component given by 


P*9 (S3,Sq), 

P*3 — S 

0 , 


forj G S, 
forj i S, 


(4.8) 


where p*J denotes the q— th component of pi*^ and is the se0 of indexes of the grid points supported in 

Br{Zj). 

Note that p* is an s-sparse vector of the same support S as p, but with entries given by the “weighted” 

( 2 ) 

sum of the components of p*' supported in the vicinity of each source. When the radius r is small, the 
complex weights {gj,gq) are close to one, and p*j is approximately the sum of the components of pi*^ 
supported in y^j. Furthermore, (4.7) implies that when is sufficiently small such that the right hand 

side of (4.7) is less than min^g^j |pj |} , p* has a non-zero component in the r-neighborhood of every source 


location. 

The statements of Theorem 4.1 and Corollary |4.2| are already illustrated in the right plot of Figure [TT| 
Additional examples are in Figure 4.2 where we show numerical reconstructions for two sources (left plot) 
and five sources (right plot). We display the balls Brifj) in green, and the entries in p* with stars of size 
proportional to their magnitude. In the left plot (see also the zoom in Figure 4.3) the sources have weak 
interaction I{y) = 0.086, and for r = 0.009 we have ||pl°^||i/||p*|ji = 2.4% and ||p— p*||i/||p||i 


= 18%. 


^Note that the set depends on r, but for simplicity we suppress the dependence in the notation. 
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Fig. 4.2. Reconstructions of two sources (left) and five sources (right). The support of pi, is indicated with a star of size 
proportional to its magnitude. The balls Br{yj) ciTe drawn in green. The radius is 0.009 in the left plot and 0.11 in the right 
plot. The axes are range and cross-range in units of base resolution h'^ and h*. 
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Fig. 4.3. Zoom of the image displayed in the left plot of Figurearound one of the sources shown with a blue square. 


For r = 0.005 the error drops to 1.14%, however Up* |ii/|!p*||i grows to 19%. That is to say, roughly 80% 
of the amplitude of the reconstruction is accumulated very close near the sources. In the right plot the 
interaction coefficient is larger l{y) = 1.43, but the reconstruction is still good, ||pi°^||i/||p*|ji = 0.5% and 
l|p- P*lli/I|p||i = 10% for r = 0.11. 

Note that in both simulations the support of the reconstruction is much better than predicted by Theorem 


4.1 which gives a pessimistic bound for r > 21{y). A sharper estimate may be obtained under the additional 


assumption that all the entries in p are positive, by taking advantage of cancellations of the oscillatory terms 
in the sums analyzed in the proof of the theorem in section |4.3.1[ However, this is difficult to do without 
making strong assumptions on the geometric distribution of the sources in the imaging region. 

The next theorem considers the more general case of an inexact model, due to noisy data and sources 
off the grid, and uses the penalty formulation (2.9). The result is stronger than in Theorem 3.2 as it 
states that the minimizer p* is exactly supported in the vicinity of the sources, for large enough penalty 
parameter 7 . This is somewhat expected, as increasing 7 in (2.9) means putting more emphasis on having 
a smaller norm i.e., a sparser solution. What is interesting is that the support of this sparse solution is 
guaranteed to be near that of the unknown sources. However, increasing 7 comes at the cost of a larger 
residual \\Qp^, — d\\ 2 , and there is no guarantee that the error of recovery of p is small, as in Corollary 4.2 


Theorem 4.3. Consider s sources supported in the set y = {y^-, j = l,...,s}, with interaction 
coejficient X{y) < 1/2, so that we can find an r & (0,1) satisfying r > 22{y). Then, for sufficiently large 
penalty parameter 7 , that depends on the noise and modeling error, the minimizer p* of !j2.£^ is supported 
in Ui=i^r(yj )- 
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Fig. 4.4. Numerical simulation for two clusters of sources. The support of pi, is indicated with a star of size proportional 
to its magnitude. The balls Br of radius r = 0.017 are drawn in green. The axes are range and cross-range in units of base 
resolution h^ and h*. 


4.2. Statement of results for clusters of sources. Here we give the generalization of Theorem |4.1| 
to clusters of sources. We assume for simplicity, as in Theorem |4.1[ that the sources are on the grid. The 
result extends to sources off the grid by modifying the proof of Theorem |4.3[ 

Let us define the effective support Sg C S of p £ C^, for some e S (0,1), so that 

3^ = {yi,-■ • ,ys} = {zj, j e 5} c jj He(z,), (4.9) 

q€:Se 


where 


H,(z,)nH,(zO=0 yi,q&S„l^q. 

More explicitly, we cover the set y of locations of the sources with disjoint balls of radius e, centered at 


points in S^. The set Sg is the support of the effective source vector p, with entries defined similarly to (4.8) 

for j eSe, 
otherwise. 


Pj = 


PqiSqT^j) 

q&SnBe{Sj) 

0 , 


(4.10) 


Obviously p depends on e, but we suppress the dependence in the notation. When £ 1, meaning that 

the sources are tightly clustered around the points in 5^, the effective source is approximately the sum of 


the entries of p supported in the cluster. When £ is larger the complex weights in (4.10) can be far from 


one and oscillatory, so there may be a lot of cancellations in the sum in (4.10). Cancellations (destructive 


interference of sources) can arise for tight clusters as well, when the entries in p in a cluster have opposite 
signs. 

The result stated in the next theorem says that if the clusters are far apart and there is little destructive 


interference of the sources in the cluster, the support of the optimizer p^ of (2.7) is concentrated near the 
sources. 

Theorem 4.4. Suppose that the unknown sources are supported on the grid at points enumerated by 
the set S, and th at t here is an e & (Oil) for which we can define the effective support Sg. Let p* be the 
£i minimizer of {2.7} and decompose it as p = pi*^ + pl°\ where p*^ is supported in the disjoint union 
Br{ij), for r satisfying e < r < 1, and pi°^ is supported in the complement of this union. We have 




< 


2I(3^e 


|P*l|l + 


IpIIi-IIpII 


(4.11) 


where = {zj, j G 5e} is assumed to satisfy T(3^e) < 1- 

The assumption X{yff) < 1 is used in the proof, but for the estimate ( |4.11 1 to be useful we need 
X{ye) < rl2 < 1/2. This is because by definition (4.10) of p we have |lp||i < ||p||i and the bound is larger 
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than Up*111 for X{yi,) > r/2. As before, the estimate in the theorem is pessimistic. The numerical results 
are better, as illustrated by the simulation in Figure |4^ There are five sources with amplitude equal to 
one, and locations indicated in the plot with blue squares. One source is isolated and the other four form a 
cluster, so has cardinality two. The interaction coefficient of the set y is large, T{y) = 3.02, because of 
the cluster, but I(A’e) = 0.096. The balls shown with green in the figure are for r = 0.017, and the entries 
in p* are indicated with stars of size proportional to the magnitude. The error is ||pl°^||i/||p*||i = 0.54%. 
The restriction of p* to the ball containing the cluster has li norm 2.3. The restriction to the other ball has 
norm 0.96, which is close to the amplitude of the isolated source. 


4.3. Proofs. We begin in section 4.3.1 with the proofs of Theorem 4T and its Corollary |4.2| Theorem 
|4.4| for clusters of sources is proved in section |4.3.2[ The proofs of Theorems |4.1| and |4.4| are similar but the 
proof of Theorem |4.3|for penalty reconstructions is more involved. We present it in section [4. 3. 3[ 


4.3.1. optimal reconstructions of well separated sources. Theorem |4.1| and its corollary as¬ 
sume an exact model, with well separated sources on the grid, at points indexed by S. Recall that is 
the set that enumerates the grid points supported in Br{zq), for q G S. The balls Br{zq) are disjoint by 
assumption, so each nonzero entry in p* ^ is contained in exactly one ball and 


supppi*^ = [J yq. 

qGS 


By definition of p* we have Gp = Gp* or more explicitly, 


qes qeSjey 


(4.12) 


where we denote the support of pi°^ by = {1,..., A^}\ yj. The proof of the theorem and its corollary 

jeS 

amounts to estimating the inner products of the left and right sides of equation (4.12) with a carefully chosen 
vector u, as shown next. 

Proof of Theorem 14.11 Let us define the vector 


u = ^sign(p,)gg, 
q^S 


(4.13) 


where “sign” denotes the complex sign function, and take the inner product of u with the left and right 

=: Tr, (4.14) 


hand side in (4.121. We obtain 


I H P‘1 ^§9’= I H H p*1 + H p*j u) 

q&S qGSje^g je^‘= 


where obviously the left hand side 71 equals the right hand side Tr. We distinguish them here so we can 
bound them separately below and above. 

For 71 we have 

Pq SigB-iPq) + Pq H Sign(pj ) (g„ gj ) | 

= I H [id?! +P <1 H sign(pj) (g„gj) 

9G5 jeSJ^q 

>J2\p<i\-J2\p<i\ H i(g<?.gj)h 

9G5 ijG5 


where we used the definition of the sign function and the triangle inequality. Since 


H l(gg,gj)l 

j^S.j^q 


<max |(g„gj)|< max Y l(g<?> gj)l = ^(3^). 

q=l,...,N 

JeSJ^q jeS.j^q 
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and p is supported on 5, we get 


rL> II Pill [1-1(3^) 

For Tr we have by definition ( 4.13[ ) of u that 

= I H + X sign(pi) (gj,g/) + ^ ^p[°^sign(pi)(gj, 


(4.15) 


Si) 


qeS je^g 

and using the triangle inequality 


ies,i^q 


leS 


Ti?<x X ipijiKgi’g?)!+ X X i^Si X i(gj>g')i+ X ipi?iXi^sj’S')i- 

q&S jG^g q&SjGJ^g lGS,l^q l&S 


In the first term in (4.16) we have 


because j G S^q. In the second term 


l-r< |(gj,g 5 )| < 1, 


X l(gi,&)l<I(3^), 

ies,i^q 

by definition ( |4.4[ ) of the interaction coefficient and the fact that Zq is the closest source point to Zj. To 
bound the third term in (4.16), recall that ^(zj) is the closest source point to Zj, for j G . Its distance 
from Zj satisfies !)0{zj,jV{zj)) > d, by definition of and therefore 


Moreover, 




X l(gi,&)l<I(3^), 

IgS ,zi^^{zj) 


SO in the third sum in ( 4.16[ ) we have 

Xl^Si,gi)l < llpi°^lli[i-?' + 2:(3^)j, Vj 


les 

Thus, the bound on Tr becomes 

rfl< ||p?^lli[i + i(3^) 

To complete the proof use that |lp*||i = ||p*^||i + ||pi°^||i in (4.17) and obtain from equations (4.14) and 
(4.15) that 




l-r + I{y) 


(4.17) 


lP*lli 


I-Ay) < iipIIi ^-Ay) 


< iip*ii 


i+Ay) 


^iipi°^iii- 


The first inequality is because p* is the minimizer in d^ . Statement ( |4.6[ ) follows from this equation. □ 
Proof of Corollary 4.2; We start with equation (4.12[) and take inner product with vector 


u = X 

q^S 


Uq = sign(p, - p*g). 


We obtain 


X p<i u) - XIX p*] = X ^2 u 


qGS 


qeS jes^g 


je^‘= 


=--rR, 


(4.18) 


(4.19) 
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and proceed as in the previous proof by bounding both sides of this equation. 

For 71 we have 

= ~ ^* 9 ) + J2p<i (sq,sj)-J2J2 p*j gi) 

geS q£S jeSJ^q qGSjGS^g lGS,l^q 


where we used definition (|4.8[) of the components of p*. We bound it as 


Tl>J2\P‘i-P*i\-J2\p<i\ H l(gi.gi) 

qeS qeS jeSJ^q qeSjeJ^g leS,l^q 


> IIp- p*lii - 


1 + Up* ^ 111 


)Ay), 


(4.20) 


using the triangle inequality and the definition of aq and 1 {y)- 

For 71 we have by definition (4.18) of u and the triangle inequality that 

71 < ^ |pi?ll(gi,g.>-(z,))l+ H I(gi:g9)l- 

qes,z,^^{zj) 

The right hand side in this equation can be bounded as in the proof of Theorem |4.1[ and the result is 


Tii < |lpi°^||ifl-r+Z(3^i) 


(4.21) 


Now equations (4.19l-(4.211 give 


Ip - P*l|i < (IIPlli + llP^^lli ji(3^) + llpi°^l|i [1 - r+ i(w) 

= (||p|li + llplii)i(l^) ' 

< [\\p\\, + \\p4,)iiy) 


||pF'l!i(l-r) 


llP*lli 


2(1 - r)I{y) 


with the last inequality due to Theorem 4.1 Corollary 4.2 follows from this inequality and ||p* 111 < IIpIIi- n 

4 . 3 . 2 . £i optimal reconstructions of clusters of sources. The proof of Theorem |4.4| is a slight 
modification of that in section |4.3.1| We begin by defining the index map : 5 —)■ iSg that takes any j G S 
to JsU)j the index of the point in Sg at the center of the ball containing Zj i.e., Zj G Se(zj^(j)). Obviously, 
the restriction of 7^ on 5 H is the identity map. 

Using the definition of p* and its decomposition in pi*^ and pi°^ we obtain the equivalent of equation 


(4.12) 


=1111 P*hj + P*hl^ 

qes qeSg jes^g jG.y’- 


(4.22) 


where = {1,...,N}\ S^q. We take the inner product of both sides of this equation with vector 

qeSe 


u = X! ^9^9’ ^^9 = sign(Pg), 

qGSe 


(4.23) 


and get 


■= I'll Pi ^§9>u) 

q^S 


H H pIj pIj ^Si,u) 

q&Sg jej^g 


=--Tr, 


where 7 l and 71 denote the left and right hand side of the equation, as before. 


(4.24) 
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For 71 we have 


Tl = I ^ fT, Pj (S3,Sq) H Pi gg) 

gs-Se jesnBeiSq) q&Se jes,zj^Be{zq) 

= I +J2^1 Pi iSj,Sq) 

q&Se q&Se j eS ,Zj ^Bq (Zq) 

> \Pi\-J2\Pi\ l(gj.gg>l 

q&Se j&S qeSe,q^Je(j) 

> E \pq\-J2\pi\^(y^)- 

qeSq jeS 


The equality in the second row is by definition (4.10) of the effective source vector p and definition (4.22) 
of Gq. The bound in the third row is by the triangle inequality and in the last row by definition of T{yg). 
Thus, the left hand side of (4.24) satisfies 


rL>||p||i-||p|iiX(3;,). 


(4.25) 


For the right hand side 71 we have from the definition of u and the triangle inequality 


7i = |E E4? 

s i: i: ii’i 

q&Se i^Sl’q 


(gj,gg) 


{Sj,Sq) 


- E + E E 

leSq,l^q leSq 

E KSi’S?)! + E E I I 

IGSs ,1^(1 ^£tSe 


In the first term we can only say that | (gj,gg) | < 1, because the points indexed by =5^ are all clustered 
around iq. The sum in the second term is bounded by the interaction coefficient of the set I’e, and for the 
last term we have 

E EII = E \p*j\[\{sj^sjqU ))I + E I^Sj.gi) 

jej^q- iGSq ieSq,i^Jq{j) 

^ E \p^:]\[i-r+i{ye) ■ 


The upper bound on 71 becomes 

TR<\\p^'^\\l\l+I{ye) 


llpi°^l!i 


l-r+I{y,) 


= \\p*\\l 


i+Aye) 




and substituting it in (4.24) and using (4.25), we get after some rearrangement 

l+Aye) 


Iplli-llplli + lplli 1-1(3^.) 


< WpA 




(4.26) 


The statement of Theorem 4.4 follows from this, the assumption that I{ye) < 1 and |ip*|ji < ||p||i- Q 

4.3.3. penalty reconstructions of well separated sources. Before giving the proof of Theorem 
4.3 let us introduce some notation. The sources are at points in I* = {y^, q = 1,..., s} which may be 


off-grid, and we let 5^ be the set of indexes of the grid points in the r—vicinity of the sources, so that 


ij&yjBr{yq), VjGJ^. 
g=i 


(4.27) 


The complement of the set is 5^'^ = {l,...,iV} \ y. For any vector u G C^, we denote by its 
restriction to the set SP. This is a vector of length \y^\ < N. We also let Qqy, be the M x \S^\ matrix with 
\5^\ columns gj, for j G y, and denote by 7^^ the orthogonal projection on the range of G^,. It is given by 

r^=G^Gl, (4.28) 
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where f denotes the pseudo-inverse. 

The proof of Theorem |4.3| is based on the next two lemmas. The first uses results in convex analysis, 
specifically the sub-gradient of a convex function, defined in [55]. We need here the sub-gradient of the £i 
norm function evaluated at u G C^, which is shown in [35] to be any vector in the set 

9||u||i = {^ G s.t. = sign(ui) if ^ 0 and |^i| < 1 if Uj = 0}. (4.29) 


The second lemma estimates the Lagrange multiplier 7 needed to prove the theorem. 
Lemma 4.5. 


Let p* 


minimize the augmented Lagrangian ^{p) defined in (2.9), over vectors supported 
in y, with its restriction to . Then, there exists a sub-gradient vector ^ G 9||p*^||i such that 


-h 7^ = 0, 


(4.30) 


where the index H denotes the Hermitian adjoint ofGj^, a matrix in . Moreover, if we let S G d/’ he 

the set of s grid points that are nearest the locations yj of the s sources, we get 

G^G^ip.^-Gld)+^^,=0. (4.31) 


Proof: For any p supported in we have ||p||i = ||p_y,||i, and using Pythagora’s theorem 

\\Gp-dg = \\G^P^-V^dg + \\d-r^d\\l 
Therefore .if’(p) = .if^(p^) -I- |||d — with defined on vectors of length |c5^|, 

= \\\G^x-V^d\\l+-f\\yi\\i, VxgC'-^I. (4.32) 

We conclude that p*^ is the minimizer of . Then, results in convex analysis |28l |32| imply that 0 must 
be an element of the sub-gradient of Equivalently, there exists a vector | G 9||p*^||i satisfying (4.301, 
where we use the expression of the projection V^. Equation (4.31) is just the restriction of equation (4.30) 
to the rows indexed by S. □ 

Lemma 4.6. Let p* be the minimizer of .^{p) over vectors supported in . If j satisfies 


v^IIp*^ -^td||i -h \\G^AV^d-d)\\^ < 7 


1 — max 




(4.33) 


with ^ as in Lemma j.5 p,, is the global minimizer of .^{p) over . 

Proof: To prove the lemma we show that any perturbation of p* by a vector that is not supported in 
^ leads to an increase of the objective function .if. This implies that p* is a local minimizer of .if in C^. 
That p* is the global minimizer follows from the convexity of .if. 

Consider an arbitrary vector v G and decompose it as 


V = u -I- w, 


suppuCo?^, suppw C 


(4.34) 


For small and positive e we have from definition (2.9) and the disjoint support of u and w that 
.if(p* -f ev) - .if(p*-l-eu) = e real(^ (^p* - d,0w) ^-l- 7 ||w||i -I- ^||0w||2 -l-real(^ (^u,^w) j , (4.35) 
with the first term dominating the second for e <C 1. We write it in terms of the components Wj of w as 
{Gp*- d,Gvr) = {G,y.p^,^ - d,gj)wi= Y iS^P*y -V^d,gj)wj - Y (d - d, gj) (4.36) 

^6.5^“ 


where we used that p* is supported in We estimate next the two sums in the right hand side. 
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For the terms in the first sum we have 


-iP^d,gj) = {G^p^^ -V^d,Vs^j) + {G,y.p*,^ -V^d,{I -Vs)^j), (4.37) 

where Vg = Gs Gl is the orthogonal projection on the range of the full rank matrix Gst with pseudo-inverse 


= [SsSs 


-1 




(4.38) 


Substituting the expression of and in ( |4.37[ ) we get 

{G^P*^ - V^d, gj) = {G^G^ (p*^ - G^d), Glsj) + (g^ - Gld'j , (/ - Vs)gj) 

= -1 {is^Gl^j) + {G,^ol,{I- Vs)^j) , 


(4.39) 


with the second inequality following from Lemma 4.5 and notation a. = p*^ — Gl^d. Let us write explicitly 
the second term in (4.39) 


{G^a,{I -Vs)gj) = -'Ps)S]) 


Oil, 


for j G 


(4.40) 


ley’ 


Since I G y, we have by definition (4.27) that there exists q G S such that z; G Br(yq), and we can 
decompose g; in two parts: g|^ which is along g^ and g^ which is orthogonal to it, 


gl = gl + gl 


= {gq,gl)gq, gt=Sl-gl- 


Clearly g| is in the range of Gs! so it is orthogonal to (/ — 'Pg)gj, and the terms in equation (4.40) satisfy 


{gl,{I-Vs)gj} I = I {gt,{I -'Ps)Sj) I < II II 211 ( 7 - 755)85 II 2 < llg/'lb- 


Moreover, by Pythagora’s theorem 




= 1 - llg| II2 = 1 - I (g?: Si)\ < 1 - (1 - rf = 2 r - < 2 r, 


(4.41) 


(4.42) 


where the first inequality follows z; G Br{yq) i.e., ^{zi,Zq) = 1 — | (gg,gi) | < r. Gathering the results 
(4.39 )-(4.42[) and using the triangle inequality we get the following bound on the first sum in (4.36) 


ymaxj I-h v^IIp*^ 

L J EtX 


jeS' 


w 1- 


For the second sum in (4.36) we have 


I (d-75^d,gj)wj| < max| (d-75^d,g;) I kil = ll^^c(d - 75^,d)||oo||w||i, 

je^” jey’’’ 

and putting together the results ( |4.43[ )-(4.44 ) we obtain from (4.36) that 

I (^Ip* -d,^w) I < [7 max | (l5,5]gj) | -k v^||p*^ -^t^lli + ll^^c(d - 75^d)||c 


w h. 


(4.43) 


(4.44) 


(4.45) 


This estimate and the triangle inequality give that the e term in (4.35) is positive for 7 satisfying 
7 > 7max I {^s^Glgj) I -k v^IIp*^ - ^i;|;„d||i -k (d - 75^d)||oo, 


tThat Gg is full rank follows from the assumption T(y) < 1/2. Matrix is not full rank because its columns are associated 
to nearby points in the vicinity of the sources, as stated in ( |4.27[ |. 
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as assumed in the lemma. Then, the perturbation of p* by the arbitrary vector (4.34) increases the objective 
function ^ and the lemma follows. □ 

To complete the proof of the theorem it remains to show that we can find a positive 7 as in Lemma |4.6| 
We begin with the estimate 




< maxJ||J|,^| 




ll,l I 






(4.46) 


where we used (4.38) and definition (4.29) of the sub-gradient of the ii norm. Now since j G Zj is 
outside every ball of radius r centered at a source or, equivalently, | (gg, g^) | < 1 — r, for all q G S. This and 
the definition of I{y) give 


ll^f gj'lli = 1 1^1- '^+Ay), 

q^S 

Moreover, for any vector u supported on S we have 

q&S jGS 

= E 


V j G 


(4.47) 


qGS 


q+ E (Sq,Sj)' 
i^S,j^q 


9G5 j&SJ^q 


> 






so the operator norm of the inverse of which exists because Qg is full rank, satisfies 


{SsSs 


, -1 


< 


1,1 


1-1(3^) 


5 
n -1 


Putting together (4.46 1 -( 4.481 we obtain that 


‘iDy’ < 1 . 


(4.48) 


(4.49) 


where the second inequality is by the assumption of the theorem that r > 2T{y). This shows that the right 
hand side in equation (4.33) in Lemma 4.6 is positive. 


Finally, we show that the left hand side in equation (4.33) is bounded independent of 7 . Clearly, the 
term ||0^„('P^d — d)||oo does not depend on 7 . It is due to the modeling error that is small when the grid 
is fine enough and the additi ve noise, that may cause d to lie outside the range of G^ i.e., d ^ To 

bound the first term in (4.33) note that u = Gl,d is the minimum £2 norm solution of G^u = d if it exists, 
or otherwise the minimizer of the least squares misfit ||^^u— d|| 2 . Then, since minimizes we have 

= ^\\G,y.p^g, - dlla+7llP*.rlli <-^.s»(u) = ^||^l^u-d||2+7||u||i, 


and using that \\Gg,u - d|j 2 < \\GjyP*g, - d|| 2 , we get ||p*^||i < ||u||i. Consequently, 

\\p*g, -G^dWi = |jp*^ -u|li < ||p*^|ji-k ||u||i =2||ui||, 


(4.50) 


independent of 7 . We conclude that we can find 7 > 0 as in Lemma |4.6[ and therefore complete the proof 
of the theorem. □ 
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5 . Summary. We presented a resolution study of sensor array imaging of a sparse scene of point sources 
or scatterers. The setup is in the paraxial regime, where the array aperture is small with respect to the 
distance to the imaging region. The imaging is done with two sparsity promoting optimization methods: 
l\ optimization (basis pursuit) and £i—penalty. The latter deals with noise and modeling errors. Our 
resolution analysis takes into account the sparse support of the unknowns. In case that they lie on the 
imaging grid, we obtained conditions on the grid size that guarantee their exact and unique recovery for 
noiseless data. This is for both single frequency and broad-band regimes, and the results show the benefit 
of having multiple frequency measurements. In case that the unknowns lie off-grid, we studied imaging on 
fine grids that mitigate the modeling error. We showed that when the unknowns are located at sufficiently 
far apart points in the scene, or they lie in well separated clusters, the results of imaging with sparsity 
promoting optimization are useful. The support of the reconstruction is near that of the unknowns and its 
locally averaged amplitudes approximates the true ones. 

Acknowledgments. This work was partially supported by AFOSR Grant FA9550-I5-1-0118. LB also 
acknowledges support from the ONR Grant N00014-I4-1-0077. 


Appendix A. Numerical setup. The simulations are for an aperture a = 25A, range L = lOOOA. We 
used different sizes of imaging grids such as Wm = 10 x 10 x 20 or Wm = 5 x 64 x 64, for a given mesh size 
h= (/i,/i,/i3). 

The ^l optimization is solved with the package [TB]. For noisy data we solve (2.9) with 7 chosen to 
be close to the noise level. We find the results to be very similar to those obtained from the constrained 
optimization 


min IIpIIi s.t. Il^p —d|| 2 < noise level. 
pec« 


Because the simulations give only an approximation p* of the minimizer p*, we threshold the results at 1% 
of the maximum entry in absolute value. We say that p is recovered numerically if ||p* — p||oo/|ip||oo < !%■ 

Appendix B. Derivation of the paraxial model. We have for x = (x, 0) in the array and z = (z, Z3) 
in the imaging region that 


|x-y| = L 




(B.l) 


so we can approximate the geometrical spreading factor in the Green’s function by 

1 _ 1 
47r|x —y| 47rL 


(B.2) 


The phase is given by 


with remainder 


fc|x- y| = kzs. + -f £:(x,y), 

22:3 


(B.3) 


■^^(x,y) = - 


/c|x — 2 

8 zi 


+ 0 




The scaling assumptions (3.3) make most of the terms in £ negligible, except for those that are independent 

tio 

/c|x|^ 


of z, which cancel in the product of the Green’s functions in the left hand side of equation (3.5). Thus we 
write 




(B.4) 


We also have 


fc|x —zp fc|xp kx ■ z 


2zr 


203 


2^3 


k\z\‘ 

2x3 
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with the last term negligible by assumption (3.3). Simplifying further, we get 


z|2 _ A:|x| 2 fc|xpZ 3 fcx-z / D§\ 

2z3 2L 2L2 L \\L L'^) \ \L‘^ ) 


(B.5) 


with the remainder negligible by (3.3). 


Now let X = Xr and obtain from (B.2|-(B.5) that 


G{uJ,Xr,Zj)G{uJ,5Cr,Zg) 


pife(z3,3-Z3,,) ife|=c,.|2(^3_j-^3_,) 

-P 2L^ L 

(47rL)2 


Equation (3.5) follows by summing over r. □ 

Appendix C. Derivation of the broad-band paraxial model. Using the parabolic scaling, we find 
as in appendix [B] that 

„ (^3,q'~.^3,d ik^\xr\'^(z^q-z^i') i fc ^ x^. ■ (zg - zp 

G{ujj,ir,Zq)G{ujj,Xr,Zi) « -® ^ ^ ’ 


(C.l) 


where kj = uijlc. Moreover, the phase $ in the right hand side of (C.l I is 
$ = 


, , . fcoixrl (2:3.9 - ^^ 3 ,/) koVir-iZg-zi) (BaD\ /Ba^D 3 \ 

- Z3.z) - ---- y - + 0[y^) + 0[y^) , 

with negligible remainder by assumption (3.15). Result (3.17) follows after substituting the approximation 
of d* in (C.l I, multiplying with the Gaussian pulse, and summing over the frequencies and the receivers. We 
also remove the phase ko{z^,q — z^^i) in equation (3.17). □ 


Appendix D. Proof of estimate 3.25[ Simplifying notation 


as 


P-V ^ a ^ n 


P + V , r- 


and choosing a contour defined by the line segments at angle 0 and 7r/4 with the real axis, and the circular 
arcs at radius rg and ri, we obtain 


dte** =- 


t/4 

dd 


sin{29)+i[e+rl cos(26()] _ j.gg-’'o sin{2e)+i[e+rl cos(2e)] 


+ Vi dr( 


(D.l) 


For the last term we have the estimate 


rV9 




dse-^" = 

2 


(D.2) 


where we changed variables as r = Tq + s. Moreover, 

t/4 


[ ^d0roe-^Vi’^(‘^e)+^[e+rlcos(2e}] < f ^=—(l - (D.3) 

Jo Jo 4ro V y 


and similar for the other integral over 0, because sin(20) > 46*/7r, for all 0 G (0,7r/4). The estimate (3.25) 
follows from (D.l)-(D.3) and the triangle inequality 


f 

V Vn 


dt( 


- 2 




< JL + ^g-o= < V + AV0 _ 

2 ro 2 a 

The second inequality is because ri > rg, the third inequality is by the definition of rg and the last inequality 
is because e~^ < 2/(y^a:) for any a; > 0. □ 
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